Communities and beyond: mesoscopic analysis of a large social network with complementary methods

Community detection methods have so far been tested mostly on small empirical networks and
on synthetic benchmarks. Much less is known about their performance on large real-world net- 
works, which nonetheless are a significant target for application. We analyze the performance of 
three state-of-the-art community detection methods by using them to identify communities in a 
large social network constructed from mobile phone call records. We find that all methods detect 
communities that are meaningful in some respects but fall short in others, and that there often is 
a hierarchical relationship between communities detected by different methods. Our results suggest 
that community detection methods could be useful in studying the general mesoscale structure of 
networks, as opposed to only trying to identify dense structures. 
I. INTRODUCTION

Large complex networks have different levels of organi- 
zation. On the microscopic level networks are composed 
of pairwise interactions, but it is the macroscopic level 
that has received most attention in recent years. We now 
know that diverse networks exhibit similarities for ex- 
ample in degree distribution, average path length, and 
clustering coefficient. While the structure is interesting in
its own right, it also has a significant influence on the dynamic
processes taking place on the network, such as spreading,
diffusion, and synchronization [1]-[3].

The intermediate mesoscopic scale has turned out to be
more elusive to describe. It is at this scale that we can
identify, for example, motifs [4, 5] and dense clusters of
nodes commonly known as communities. Although communities
are relevant for understanding the structure of
and the dynamics on networks, even their exact defini- 
tion is still a controversial issue. Thus it comes as no 
surprise that the art of community detection has grown 
into a swarming field of diverse methods [6]. Many features
of real-world networks add to the complexity of the
task. Real networks are often hierarchical and hence small 
communities may reside inside larger ones, communities 
may overlap if nodes participate in several communities, 
and even more complications arise if we take into account 
link weights that represent interaction intensity. 

Until recently, the performance of community detec- 
tion methods has mainly been tested on small empir- 
ical networks with typically no more than 100 nodes, 
which allows the evaluation of quality by visual inspec- 
tion. However, several networks of considerable interest 
are much larger, often with 10^ nodes or more: data on 
WWW, mobile phone call records, electronic footprints of 
instant messaging users, and social web networks such
as Facebook. Only a few methods are efficient enough to
handle such networks [7]-[10]: to be successful, a community
detection method must be computationally efficient
in addition to being accurate.

More systematic comparisons have recently been carried
out using synthetic benchmark networks with built-in
community structure [11, 12]. While benchmarks are
useful in evaluating performance, even their authors
acknowledge that they only represent the first step. No
benchmark fully incorporates the spectrum of properties
commonly observed in real-world networks. Some recent
benchmarks do allow heterogeneous distributions for
degrees and community sizes, but many other properties are
still missing, such as high clustering, existence of cliques
[13], overlapping communities [14], assortativity [15], and
the prevalence of motifs [16]. This distorts the evaluation
of algorithms that depend on (or benefit from) the
existence of these features. For example, clique percolation
has been successfully used on real-world networks [13], [17]-[19]
but does not perform well on synthetic benchmarks,
mainly due to its strict requirement for communities to
consist of adjacent cliques.
In this paper we take three widely-applied methods,
each based on a different underlying philosophy, and com- 
pare their performance on a large real-world social net- 
work constructed from mobile phone call records. Unlike 
with benchmark networks, we do not know the "correct" 
community structure of the network. Therefore, we in- 
troduce new measures that allow us to investigate the 
differences and similarities of the detected community 
structures. 

The paper is organized as follows. Section II describes
the choice of community detection methods and Section
III introduces the data set. Section IV presents the results
of our analysis, where we first analyse the properties and
statistics of individual community structures and then
turn to a pairwise comparison to quantify the differences
between communities. Finally, in Section V we present
conclusions.
II. CHOICE OF METHODS

As we intend to study a large network, the first selection
criterion is purely practical: methods with running
time $O(N^2)$ or slower cannot be included. We use three
methods that not only fulfil this requirement but in addition
have performed well in previous comparisons or in
practice: the Louvain method (LV) [9], the Infomap (IM)
[20] and the clique percolation (CP) [13].

We consider an undirected network $G = (V, E)$, where
$V$ is the set of $N$ nodes and $E$ the set of $L$ edges.
The degree $k_i$ is the number of neighbors node $i$ has,
$k_i = |\{j \mid (i, j) \in E\}|$. For mathematical purposes a community
$c$ is simply a set of nodes, $c \subseteq V$, and we denote
community size by $S = |c|$. The communities detected
by one method constitute a community structure
$C = \{c_1, \ldots, c_n\}$. A partition $P$ is a special community
structure where each node belongs to exactly one community,
i.e. $c_i \cap c_j = \emptyset$ if $i \neq j$ and $\bigcup_{i=1}^{n} c_i = V$.

All three methods can be extended to handle weighted
networks where each edge has a numerical weight $w_{ij}$.
In this paper we only consider positive weights; $w_{ij} = 0$
is equivalent to $(i, j) \notin E$. The weighted counterpart of
degree is node strength: $s_i = \sum_{(i,j) \in E} w_{ij}$.
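
As a minimal sketch of the notation above (not part of the original analysis), the degree $k_i$, the strength $s_i$ and the partition condition can be computed from an edge list as follows. The function names and the toy edge list, with weights read as call durations in seconds, are illustrative assumptions.

```python
from collections import defaultdict


def degrees_and_strengths(weighted_edges):
    """Return (degree, strength) dicts for an undirected weighted edge list.

    Each edge is a tuple (i, j, w) with a strictly positive weight w.
    """
    degree = defaultdict(int)
    strength = defaultdict(float)
    for i, j, w in weighted_edges:
        if w <= 0:
            raise ValueError("only positive weights are considered")
        degree[i] += 1
        degree[j] += 1
        strength[i] += w
        strength[j] += w
    return dict(degree), dict(strength)


def is_partition(communities, nodes):
    """True if every node belongs to exactly one community."""
    seen = set()
    for community in communities:
        members = set(community)
        if seen & members:  # a node would belong to two communities
            return False
        seen |= members
    return seen == set(nodes)


# Hypothetical toy network.
edges = [(1, 2, 120.0), (2, 3, 30.0), (1, 3, 600.0), (3, 4, 45.0)]
degree, strength = degrees_and_strengths(edges)
print(degree[3], strength[3])
print(is_partition([{1, 2}, {3, 4}], {1, 2, 3, 4}))
```

An overlapping community structure such as `[{1, 2, 3}, {3, 4}]` fails the partition check, which is the situation that arises with clique percolation.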

The Louvain method (LV) [9] was the best of the modularity
optimization methods tested in [8]. Modularity compares
the number of edges inside communities in the actual network
with the number expected in a random network with the same
degree sequence [21]:

$$Q = \frac{1}{L} \sum_{c \in P} \left( L_c - \frac{d_c^2}{4L} \right), \tag{1}$$

where $L_c$ is the number of edges inside community $c$ and
$d_c = \sum_{i \in c} k_i$ is its total degree. In the weighted version all
quantities are replaced by their weighted counterparts:
$L_c$ by the total sum of weights inside a community, $d_c$
by the sum of node strengths and $L$ by the sum of edge
weights in the whole network.
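
For illustration only, the following sketch evaluates modularity in the standard form $Q = \sum_c [L_c/L - (d_c/2L)^2]$, written with the symbols $L_c$, $d_c$ and $L$ used in the text. The function name and the toy graph (two triangles joined by one edge) are assumptions, and this is not the Louvain implementation used in the paper.

```python
def modularity(edges, partition):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (i, j) pairs; partition: list of sets of nodes.
    """
    num_edges = len(edges)  # L
    community_of = {node: idx for idx, c in enumerate(partition) for node in c}
    internal = [0] * len(partition)      # L_c, edges inside community c
    total_degree = [0] * len(partition)  # d_c, sum of degrees in community c
    for i, j in edges:
        ci, cj = community_of[i], community_of[j]
        total_degree[ci] += 1
        total_degree[cj] += 1
        if ci == cj:
            internal[ci] += 1
    return sum(
        lc / num_edges - (dc / (2 * num_edges)) ** 2
        for lc, dc in zip(internal, total_degree)
    )


# Two triangles joined by the single edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
q_split = modularity(toy_edges, [{0, 1, 2}, {3, 4, 5}])
q_single = modularity(toy_edges, [{0, 1, 2, 3, 4, 5}])
print(q_split, q_single)
```

Here the split into two triangles gives $Q = 5/14$, while putting all nodes into one community gives $Q = 0$, as expected from the definition.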

Because modularity optimization is an NP-complete
problem [22], LV uses a greedy heuristic to find a local
optimum. Each node is initially a separate community,
i.e. $c_i = \{i\}$. Neighboring communities^1 are merged
in random order so that modularity increases maximally
at each step until a local maximum is reached.
Resulting communities are then shrunk into "supernodes"
and the optimization is repeated on the new
"renormalized" network. The two steps (optimization
and renormalization) are repeated recursively until no
further improvement of modularity is possible.

The local heuristic of LV seems to avoid some of the 
resolution issues of modularity. In addition, the renor- 
malized networks can be understood as different levels of 
a hierarchical community structure. 



The Infomap method (IM) [20] came out on top in a recent
state-of-the-art benchmark comparison [8]. The idea
is to describe a random walker with a two-level coding 
scheme: the higher level has a single codebook for com- 
munities, on the lower level each community has its own 
codebook with a special exit code for moving out of the 
current community. The optimal partition corresponds to 
the codebook with the minimum description length: too 
small communities increase the description length due to 
higher frequency of community crossings, while commu- 
nities containing too many nodes require longer descrip- 
tion. In weighted networks the random walks are biased 
towards edges with higher weight. Since an exhaustive 
search for the optimal partition is not feasible, Infomap 
employs a heuristic similar to the one used in LV. 
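
To make the description-length trade-off concrete, here is a hedged sketch that evaluates the two-level map equation published by the Infomap authors for an unweighted, undirected network. The formula itself is not spelled out in this paper and is quoted here as an assumption: the walker visits node $i$ with probability $k_i/2L$ and leaves a module with probability equal to the number of its boundary links divided by $2L$. Function names and the toy graph are illustrative, and no search heuristic is included.

```python
from math import log2


def plogp(x):
    """x * log2(x), with the convention 0 * log(0) = 0."""
    return x * log2(x) if x > 0 else 0.0


def map_equation(edges, partition):
    """Description length (bits per step) of a partition, undirected graph."""
    two_l = 2 * len(edges)
    community_of = {n: idx for idx, c in enumerate(partition) for n in c}
    degree = {}
    exit_links = [0] * len(partition)  # link ends that leave each module
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
        if community_of[i] != community_of[j]:
            exit_links[community_of[i]] += 1
            exit_links[community_of[j]] += 1
    p_node = {n: k / two_l for n, k in degree.items()}  # visit rates
    q = [x / two_l for x in exit_links]                 # exit rates
    p_mod = [sum(p_node.get(n, 0.0) for n in c) for c in partition]
    return (
        plogp(sum(q))
        - 2 * sum(plogp(qi) for qi in q)
        - sum(plogp(p) for p in p_node.values())
        + sum(plogp(qi + pi) for qi, pi in zip(q, p_mod))
    )


# Two triangles joined by the single edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
one_module = map_equation(toy_edges, [{0, 1, 2, 3, 4, 5}])
two_modules = map_equation(toy_edges, [{0, 1, 2}, {3, 4, 5}])
print(one_module, two_modules)
```

On this toy graph the two-module partition yields the shorter description, which matches the intuition that the walker rarely crosses the single bridging edge.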

Clique percolation (CP) [13] has been successfully applied
to large empirical graphs, e.g. to study the dynamics
of social groups [13]. A $k$-clique is a fully connected
subgraph of $k$ nodes, and two $k$-cliques are considered
adjacent if they share $k - 1$ nodes. As the name
suggests, clique percolation defines communities as connected
$k$-clique components: a CP community is a maximal
set of $k$-cliques such that there is a path of adjacent
$k$-cliques between them. Different values of $k$ yield different
community structures, and communities obtained
with a larger value of $k$ reside inside those obtained with
a smaller value. To select the best value of $k$ we follow
Ref. [13] and use the smallest value for which there is no
giant percolating community.
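
The definition above can be sketched directly in code. The following brute-force version enumerates all $k$-cliques and joins adjacent ones with a union-find structure; it is only meant for small toy graphs and is an illustrative assumption, not the efficient implementation needed for a network of millions of nodes.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Return CP communities (sets of nodes) of an undirected graph."""
    adj = {}
    for i, j in edges:
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)

    # Brute-force enumeration of all k-cliques (exponential, toy graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(v in adj[u] for u, v in combinations(c, 2))
    ]

    # Union-find over cliques: two cliques are adjacent if they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(range(len(cliques)), 2):
        if len(cliques[a] & cliques[b]) == k - 1:
            parent[find(a)] = find(b)

    groups = {}
    for idx, clique in enumerate(cliques):
        groups.setdefault(find(idx), set()).update(clique)
    return list(groups.values())


# Triangles {1,2,3} and {2,3,4} are adjacent; {4,5,6} shares only node 4.
toy_edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4),
             (4, 5), (4, 6), (5, 6), (6, 7)]
communities = sorted(sorted(c) for c in k_clique_communities(toy_edges, 3))
print(communities)
```

In this example node 4 belongs to two overlapping communities and node 7, which is in no triangle, is left outside all communities.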

There are significant differences between CP and the 
other two methods. Both LV and IM use a stochastic op- 
timization scheme while CP is entirely deterministic. In 
addition, LV and IM yield a partition but CP does not. 
With CP the nodes that do not belong to any $k$-clique
are left outside communities, and if a node belongs to
several $k$-cliques it may belong to more than one overlapping
community. The fact that CP does not provide
a partition is not necessarily a bad thing: sparse regions 
of the network do not appear as communities, and e.g. in 
social networks individuals often do belong to multiple 
groups, such as family, friends, and colleagues. 

To define the weighted clique percolation (wCP) [23]
we need the concept of clique intensity, defined as the
geometric mean of edge weights. In wCP we use a value
of $k$ that would give a giant community in the unweighted
case, but only include those $k$-cliques that have intensity
larger than some predefined threshold $I_>$. Analogously
to the unweighted case, $I_>$ is set to the smallest value for
which there is no giant community.
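
As a small worked example of clique intensity, the sketch below computes the geometric mean of the edge weights of a clique and applies a threshold. The function names, the weight dictionary keyed by unordered node pairs, and the numbers are assumptions made for illustration.

```python
from itertools import combinations
from math import exp, log


def clique_intensity(clique, weight):
    """Geometric mean of the weights of all edges inside the clique.

    weight maps frozenset({i, j}) to a positive edge weight.
    """
    logs = [log(weight[frozenset(pair)]) for pair in combinations(clique, 2)]
    return exp(sum(logs) / len(logs))


def cliques_above_threshold(cliques, weight, threshold):
    """Keep only those cliques whose intensity exceeds the threshold."""
    return [c for c in cliques if clique_intensity(c, weight) > threshold]


# Hypothetical weights (seconds of calls) on two triangles.
weight = {
    frozenset({1, 2}): 2.0, frozenset({1, 3}): 4.0, frozenset({2, 3}): 8.0,
    frozenset({2, 4}): 1.0, frozenset({3, 4}): 1.0,
}
strong = clique_intensity((1, 2, 3), weight)  # (2 * 4 * 8) ** (1/3)
weak = clique_intensity((2, 3, 4), weight)    # (8 * 1 * 1) ** (1/3)
kept = cliques_above_threshold([(1, 2, 3), (2, 3, 4)], weight, 3.0)
print(strong, weak, kept)
```

Working with logarithms avoids overflow when many large weights are multiplied; a single strong edge does not rescue a clique whose other edges are weak.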

Notes on applying the methods are given in Appendix A.

^1 Two communities are neighbors if there is at least one link between them.



III. THE DATA



Our empirical test network is a mobile phone call network
constructed from billing records of seven million
customers of a single mobile phone operator whose customer
base covers about 20% of the population in its
country. The records cover a period of 126 days. To ensure
anonymity of customers, phone numbers have been
replaced by surrogate keys. Data from the same operator
has been previously studied in [13, [HI . 

For this study we use only voice calls, and only those 
that take place between customers of the operator in 
question. In addition we exclude edges where only one 
person has made calls to the other during the whole period.
We study only the largest connected component,
which has $N = 4.9 \times 10^6$ nodes and $L = 10.9 \times 10^6$ edges
(mean degree $\langle k \rangle \approx 4.44$)^2.

The edge weights in the weighted network are defined 
as sums of call durations (in seconds) between the two 
customers. The average weight is $\langle w \rangle \approx 4634$ seconds.

Using a large social network enables us to relate the 
findings to known characteristics of such networks (2^ . 
It is known that the overlap of local neighborhoods of adjacent
nodes increases with edge weight^3, as conjectured
in the "weak ties" hypothesis of Granovetter [27].
This feature should be reflected in correlations between 
edge weights and communities. We can also study struc- 
tural features of communities and evaluate whether they 
represent meaningful social communities. 



IV. RESULTS 

We analyse single community structures detected by 
each method. Both LV and IM are stochastic methods 
and therefore give a slightly different partition on every 
run; however, as shown in Appendix B, the qualitative
properties of the communities are stable enough to justify 
the comparison. 

Appendix A contains detailed notes about the application
of the three methods. In brief, we use parameters
$k = 4$ for CP and $k = 3$ with $I_> = 3093$ for wCP (these
are the only two methods with explicit parameters), and
with LV we only study the first level of the hierarchical
community structure since other levels yield communities
that are implausibly large in the social context.



A. Community size distributions 

Figure 1 shows the community size distributions for
all methods. All distributions are broad, as suggested by
previous results [Tol, [13, [llj . 

For IM, the tail of the size distribution appears power-law-like.
Very small communities are rare. The community
structure of wIM is notably different. The weighted
communities are smaller, and the distribution is now
monotonically decreasing.

^2 The largest connected component contains 92% of nodes and 98%
of edges; the second-largest component has only 47 nodes.

^3 Except for the very largest edge weights, where the relation is
reversed.


Even though the largest LV communities are an order 
of magnitude smaller than in IM, LV still produces larger 
communities than its weighted variant wLV. Both LV 
and wLV have monotonically decreasing community size distributions,
and small communities are more prevalent than in IM. 
The power-law exponents for the tails are similar when 
comparing LV to IM and wLV to wIM. 

For CP and wCP the size distributions are well approximated
by a power law. This is expected, as the communities
are detected close to the critical point where a giant
community would emerge. The largest deviation from
power-law behaviour is in the tail. The largest wCP communities
are larger than those in CP because 3-cliques are
used for wCP and 4-cliques for CP. Although these communities
partially overlap (see Section IV E), the 3-clique
communities extend far beyond the 4-clique communities.



B. Visual observation of small communities 

The qualitative properties of small communities can be 
estimated visually, similarly to evaluating performance 
on small empirical networks. Fig. 2 shows archetypal
communities with S = 5, 10, 20, and 30, and their im- 
mediate network surroundings. Communities larger than 
this tend to be too complex to visualize in two dimen- 
sions. 

Of all unweighted methods the CP communities are 
the least surprising: larger communities naturally appear 
only in dense parts of the network. Small LV communi- 
ties consist of interconnected cliques, which coincides well 
with the general idea of social groups. The smallest IM
communities with $S < 10$, however, are typically treelike
and located at the "edge" of the network: these communities
are attached to the rest of the network by only a few
links. LV covers these sparse parts of the network with
much smaller communities (see Fig. 9).

When the weights are taken into account, the partition- 
based methods wIM and wLV tend to produce even more 
treelike communities that have the appearance of local 
"backbones" of the network. This is a natural conse- 
quence of the way wIM and wLV use edge weights; how- 
ever, communities like these do not coincide well with the 
idea of dense social groups. 

Figure 1. (Color online) Community size distributions for IM, LV and CP and their weighted versions. The parameter $\alpha$ denotes
the exponent when the tails are fitted with a power-law distribution $P(S) \propto S^{\alpha}$; solid lines correspond to the unweighted $\alpha$ and
dashed lines to the weighted one.

Figure 2. (Color online) Typical (left) unweighted and (right) weighted communities of different size. These communities have
been manually selected from a large random sample of communities with the intention of portraying archetypal examples.
Colored (dark gray) nodes and edges denote nodes inside a single community, and the light gray nodes are the first neighbors of
the nodes in the community. In weighted communities the edge width is proportional to the logarithm of edge weight, with
the restriction that edges with $w_{ij} < 300$ (5 min) have the minimum width and those with $w_{ij} > 14400$ (4 h) the maximum
width.

C. Community density distribution

Since some small communities were already observed
to be treelike, we turn to a more quantitative characterization
of community density. Graph density is normally defined
as the proportion of edges out of all possible edges,
$L_c / [\frac{1}{2} S(S - 1)]$. However, since communities are necessarily
connected it is more illustrative to study density
relative to the sparsest possible community, a tree with
$S - 1$ edges, as also done in [10]: we define density as
$D_c = L_c/(S - 1)$. In general $1 \le D_c \le S/2$, where the
lower bound corresponds to trees and the upper bound
to cliques. CP however does not allow trees; instead, the
smallest possible density is reached when each new node
adds only $k - 1$ edges. In this case $L_c = \binom{k}{2} + (k - 1)(S - k)$,
which gives $D_c \ge (k - 1)(S - \frac{k}{2})/(S - 1)$. For $S \gg k$
this is approximately $k - 1$.
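
The bounds on the relative density $D_c = L_c/(S - 1)$ are easy to check numerically. The sketch below, with hypothetical function names, evaluates the tree and clique limits and the CP lower bound obtained when each new node adds only $k - 1$ edges.

```python
from math import comb


def relative_density(num_internal_edges, size):
    """D_c = L_c / (S - 1): density relative to a tree on S nodes."""
    return num_internal_edges / (size - 1)


def cp_min_density(size, k):
    """Smallest D_c of a CP community: one k-clique plus (S - k) nodes
    that each add only k - 1 edges."""
    min_edges = comb(k, 2) + (k - 1) * (size - k)
    return min_edges / (size - 1)


S = 10
tree_density = relative_density(S - 1, S)         # tree: S - 1 edges
clique_density = relative_density(comb(S, 2), S)  # clique: S(S-1)/2 edges
cp_bound = cp_min_density(S, 4)
print(tree_density, clique_density, cp_bound)
```

For $S = 10$ and $k = 4$ the CP bound is $24/9 \approx 2.67$, which agrees with the closed form $(k - 1)(S - k/2)/(S - 1)$ and is already close to $k - 1 = 3$.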

Figure 3 shows the distributions and average values
of $D_c$ as a function of community size. As expected, CP
yields the densest communities. For IM the value of $D_c$
stays close to 1 until $S \approx 20$, which confirms the observation
on the prevalence of small treelike communities.
For LV the distribution has a curious bimodal shape in
the range $20 < S < 50$: typical LV communities of this
size have $D_c$ from 2 to 4, but there is a small number of
LV communities that are trees ($D_c = 1$) but none that
are almost trees. A closer inspection (not shown) of these
trees reveals that they are stars.

The plots for weighted communities in Fig. 3 suggest
that weights make the communities more similar across
methods. Both wIM and wLV communities are more treelike,
as already seen in Section IV B.

Treelike communities do not fit well either with the
idea of social groups, or that of communities in general
being dense groups of nodes. However, if a network contains
treelike regions, partition-based methods will correspondingly
yield treelike communities^4, as also seen in
Ref. [10]. The abundance of treelike parts may just be
a sampling artifact, as our network does not cover the
whole population. Nevertheless, empirical data is rarely
perfect, and a good community detection method should
deal with this in a sensible way. One could argue that in
treelike regions the network is so sparse that there is not
enough information about community structure. This
makes CP's requirement (that nodes must participate in
at least one clique to be assigned a community) appear
meaningful. On the other hand, CP may yield communities
where cliques are arranged as chains or starlike patterns,
which again does not coincide well with the idea of
social groups. Fig. 3 indicates that in CP and wCP there
are indeed some communities with densities close to the
lower bound.

^4 It has been shown that if there are nodes with a single link,
for modularity optimization they should always belong to the
community of the node to which they are connected [29]. By
construction, this holds for IM as well.

Figure 3. (Color online) The distribution of relative density
$D_c = L_c/(S - 1)$ for communities from each method. In all
plots, each column represents a distribution and is normalized
to one, the colors indicating probability density so that the
darker the color, the higher the density (see color bar). The
thick solid line denotes the average value. The dashed straight
line corresponds to cliques, for which $D_c = S/2$. For IM and
LV the smallest density is 1, which corresponds to trees. For
CP, the smallest possible density is indicated by the curved
dashed line (see text).




Figure 4. (Color online) The distribution of $p(c)$ (Eq. 2) as a
function of community size for each method. The distributions
are presented as in Fig. 3 with a similar shading scheme. The
black line denotes the average value.



Whatever the interpretation, the detected treelike 
structures do provide information about the mesoscopic 
structure of the network. In other networks starlike struc- 
tures can represent meaningful communities: for example 
in air transport networks the peripheral airports are connected
to local hubs [30].



D. Intra- and intercommunity edges 

If the detected partitions are any good, nodes should
have more edges to other nodes in the same community
than to those in other communities. To measure this we
define $p(c)$ as the ratio of total out- and in-degree of a
community:

$$p(c) = \frac{d_c - 2L_c}{2L_c}. \tag{2}$$

Figure 4 shows the distribution of $p(c)$ as a function of community
size. With respect to this measure IM produces
the most clear-cut communities: the majority of IM communities
have $p$ below one. The values for small communities
are especially low, confirming the earlier observation that
small IM communities are on the "edges" of the network.
LV communities also have $p < 1$ on average, except for
the smallest communities, but the values are not as low
as with IM. Including weights increases the average value
of $p$. wLV communities in fact have on average more links
going outside the community than inside.

Because CP allows nodes to belong to multiple communities,
a good community need not have a low value of
$p(c)$. Also note that with CP a large fraction of edges are
attached to non-community nodes. For CP (wCP) only
21.4% (18.6%) of edges and 21.8% (25.4%) of nodes are
inside communities; 47.6% (43.0%) of edges are between
non-community nodes.

From earlier studies of mobile phone call networks
[23, [m we know that there is a correlation between edge
weight and neighborhood overlap, in agreement with the
Granovetter hypothesis. As nodes inside communities
have overlapping neighbourhoods, we expect the links
between communities to be on average weaker than those
within communities. Table I shows that with all methods
this is indeed the case. With weighted methods this result
is of course not as surprising since weights were used
in identifying the communities.

To see beyond averages, Fig. 5 displays the normalized
average edge weight inside communities as a function
of community size. Most notably the edge weights in
the largest communities are below the network average,
even for wIM and wLV.

Table I. Edge weights inside and between communities.
$\langle w \rangle$ denotes the average edge weight in the whole network,
$\langle w_c \rangle$ the average weight for edges inside communities and
$\langle w_{c-c} \rangle$ between communities. CP also has non-community
nodes; $\langle w_{c-n} \rangle$ denotes the average weight between community
and non-community nodes and $\langle w_{n-n} \rangle$ between two
non-community nodes.

Method   <w_c>/<w>   <w_c-c>/<w>   <w_c-n>/<w>   <w_n-n>/<w>
IM       1.14        0.69          -             -
LV       1.20        0.78          -             -
CP       1.20        0.57          0.80          1.06
wIM      1.65        0.18          -             -
wLV      1.92        0.25          -             -
wCP      2.57        0.43          0.57          0.73

Figure 5. (Color online) Average edge weights $w_{ij} / \langle w \rangle$ inside
communities as a function of community size $S$, normalized
by the network average.



E. Neighbourhood overlap


Neighbourhood overlap quantifies the similarity of a
node's neighbourhood in two community structures. If
$\mathcal{N}_i(C_j)$ is the set of those neighbours of node $i$ that belong
to its community in $C_j$, the neighbourhood overlap
is defined as the Jaccard index of $\mathcal{N}_i(C_1)$ and $\mathcal{N}_i(C_2)$:

$$O_i(C_1, C_2) = \frac{|\mathcal{N}_i(C_1) \cap \mathcal{N}_i(C_2)|}{|\mathcal{N}_i(C_1) \cup \mathcal{N}_i(C_2)|}. \tag{3}$$

Thus $O_i = 1$ if the same neighbours of $i$ belong to its
own community in both methods and $O_i = 0$ if the sets
do not overlap. In the case of CP we only consider nodes
that participate in at least one community; for nodes
that participate in several, we assign the node to the
community where most of its neighbours reside.
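
A minimal sketch of the neighbourhood overlap of Eq. (3) for a single node is given below. The adjacency and membership dictionaries are hypothetical inputs, each node is assumed to have exactly one community label per method, and returning `None` when both neighbour sets are empty is a choice made here, since the text does not cover that case.

```python
def neighbourhood_overlap(node, adj, membership_a, membership_b):
    """Jaccard index of the same-community neighbour sets of `node`
    under two community assignments (dicts mapping node -> label)."""
    same_a = {j for j in adj[node] if membership_a[j] == membership_a[node]}
    same_b = {j for j in adj[node] if membership_b[j] == membership_b[node]}
    union = same_a | same_b
    if not union:
        return None  # undefined: no neighbour shares a community in either
    return len(same_a & same_b) / len(union)


# Node 0 has neighbours 1, 2 and 3.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
method_a = {0: "A", 1: "A", 2: "A", 3: "B"}  # neighbours 1 and 2 are inside
method_b = {0: "X", 1: "X", 2: "Y", 3: "Y"}  # only neighbour 1 is inside
overlap = neighbourhood_overlap(0, adj, method_a, method_b)
print(overlap)
```

In this toy case the intersection is {1} and the union is {1, 2}, so the overlap is 0.5.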

Figure 6 displays the average neighbourhood overlap as a
function of degree for selected method pairs^5. Nearly all
pairs show a decreasing trend and thus in general community
neighbourhoods of low-degree nodes are more similar.
The IM-CP and wIM-wCP overlaps decrease the
fastest, as the underlying philosophies are different and
the large number of nodes not appearing in any CP community
reduces the overlap. wIM and wLV show a better
match than their unweighted counterparts, suggesting a
similar and fairly strong response to edge weights. On
the other hand, overlaps for IM-wIM and LV-wLV become
small for large $k$, which suggests that taking weights
into account considerably changes the partitions for these
methods. With CP-wCP the opposite behaviour occurs
because wCP is based on 3-cliques and many nodes that
are included in a 3-clique are not included in any 4-clique.



F. Nested communities 

The above analysis shows that the three methods do
not detect the same communities. It is however possible
that they only detect different levels of a hierarchical
community structure. If this is true, then the communities
from one method should be subsets of those from another.

^5 Instead of showing the results for all 15 method pairs we only
present the most interesting cases.

Figure 6. (Color online) Average neighbourhood community
overlap $O$ as a function of node degree $k$, between different
methods (top) and unweighted and weighted versions of the
same method (bottom).


To address this question quantitatively we calculate
how accurately a single community $c' \in P_i$ can be tiled
by the communities of another partition $P_j$. The best
tiling is reached with the set $T \subseteq P_j$ that minimizes the sum
of external faults

$$F_{\mathrm{ext}}(c', T) = \sum_{c_j \in T} \left( |c_j| - |c' \cap c_j| \right), \tag{4}$$

which equals the number of nodes in $T$ but outside $c'$,
and internal faults

$$F_{\mathrm{int}}(c', T) = |c'| - \sum_{c_j \in T} |c' \cap c_j|, \tag{5}$$

which equals the number of nodes in $c'$ but outside $T$. As
illustrated in Fig. 7, the minimum of $F_{\mathrm{ext}} + F_{\mathrm{int}}$ is reached
when $T$ contains only those communities for which
$|c' \cap c_j| > \frac{1}{2}|c_j|$, i.e. those $c_j \in P_j$ that share at least half of
their nodes with $c'$. To allow comparing communities of
different size we define tiling imperfection $\mathcal{I}(c', P_j)$ as the
ratio of this minimum total fault and community size:

$$\mathcal{I}(c', P_j) = \frac{\min_T \left( F_{\mathrm{ext}} + F_{\mathrm{int}} \right)}{|c'|}. \tag{6}$$

Note that the aim of this measure is to quantify the
subset-superset relationships of communities, which cannot
be done with symmetric measures such as mutual
information.
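
The tiling imperfection of Eqs. (4)-(6) can be sketched in a few lines for the case where the second structure is a partition. The function name and the toy communities are assumptions; the example is built so that eight target nodes are spread over three communities of the other partition.

```python
def tiling_imperfection(target, partition):
    """Tiling imperfection of community `target` tiled by `partition`.

    The optimal tiling T keeps the communities that share more than
    half of their nodes with the target (strict inequality, as in the text).
    """
    target = set(target)
    tiles = [set(c) for c in partition if 2 * len(target & set(c)) > len(c)]
    f_ext = sum(len(c - target) for c in tiles)                # in T, outside c'
    f_int = len(target) - sum(len(target & c) for c in tiles)  # in c', outside T
    return (f_ext + f_int) / len(target)


# Hypothetical example: c' = {1, ..., 8} against a partition of 12 nodes.
c_prime = {1, 2, 3, 4, 5, 6, 7, 8}
c1 = {1, 2, 3, 4, 9}   # 4 of 5 nodes inside c'
c2 = {5, 6, 7, 10}     # 3 of 4 nodes inside c'
c3 = {8, 11, 12}       # only 1 of 3 nodes inside c'
imperfection = tiling_imperfection(c_prime, [c1, c2, c3])
print(imperfection)
```

Here the best tiling uses `c1` and `c2` only, with two external faults (nodes 9 and 10) and one internal fault (node 8), giving 3/8. Swapping the roles of the two structures generally gives a different value, which is what makes the measure asymmetric.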




Figure 7. (Color online) Illustration of tiling imperfection. The
8 nodes in $c'$ are spread over three different communities in
another partition. Using $T = \{c_1, c_2\}$ gives the best tiling;
including $c_3$ would reduce $F_{\mathrm{int}}$ to 0 but increase $F_{\mathrm{ext}}$ by 2.
The value of tiling imperfection is $\mathcal{I} = 3/8$.
Figure 8. (Color online) (Top) Tiling imperfection $\mathcal{I}$ and inclusion
imperfection $\mathcal{I}^*$ between IM and CP. (Bottom) Tiling
imperfection $\mathcal{I}$ between IM and LV.



It is possible to generalize this measure also for general
community structures^6 such as the one produced by CP,
but this is not advisable: if $c'$ had nodes that are
not included in any community of $C_j$, these nodes would
automatically be internal faults and the tiling imperfection
would be misleadingly high. To correct for this we
define inclusion imperfection $\mathcal{I}^*(c', C_j)$ similarly to tiling
imperfection, but nodes may be counted as internal faults
only if they are covered by both community structures.

^6 If $T^* = \bigcup_{c_j \in T} c_j$, the generalized tiling is defined by
$F_{\mathrm{ext}}(c', T) = |T^*| - |T^* \cap c'|$ and $F_{\mathrm{int}}(c', T) = |c'| - |T^* \cap c'|$. The optimal $T$ can
now be constructed by first including (as before) the communities
that share at least half of their nodes with $c'$, but then adding
also those communities that contain more uncovered nodes of $c'$
(i.e. those in $c' \setminus T^*$) than new nodes outside $c'$. Here we however
use the same definition of $T$ as for partitions to make the values
more comparable.

Figure 9. (Color online) Typical cases of tiling with IM (red
or dark gray) and LV (black) communities of size $S = 10$.
Light gray nodes are the first neighbors of the community
to be tiled. (a) Example of perfect tiling $\mathcal{I} = 0$ when an IM
community (red nodes) is tiled with LV communities (black
edges). A typical IM community with $S = 10$ is located in a
treelike region of the network, and LV covers such regions with
very small communities. (b) Example of tiling imperfection
$\mathcal{I} = 1$ when an LV community (black edges) is tiled with IM
communities (in red). A typical LV community with $S = 10$
is in a somewhat denser part of the network, where the IM
communities are much larger.

Results for tiling measures are shown in Figure 8. Comparing
the tiling and inclusion imperfections for IM-CP,
especially for small communities, illustrates the difference
between these two measures: tiling imperfection is high
since small IM communities are treelike and therefore not
included in any CP community; low values of inclusion
imperfection, however, show that CP communities tend
to be subsets of IM communities. High values of CP-IM
tiling imperfection show that the reverse is not true.

The low tiling imperfection for IM-LV and high for
LV-IM shows that IM communities tend to be supersets
of LV communities. The extreme values for small communities
indicate that nearly all small IM communities can
be perfectly tiled with LV communities, while small LV
communities can almost never be tiled with IM communities^7.
A typical tiling of small IM and LV communities
is shown in Fig. 9.

^7 Note that $\mathcal{I}$ may only take values that are fractions of community
size: e.g. with $S = 5$ the smallest non-zero value is 0.2, and to get
an average value of O(10~^) the vast majority of IM communities
must have $\mathcal{I} = 0$.

V. CONCLUSIONS AND DISCUSSION

Benchmarks are helpful if the methods are to be tested
for sensitivity to particular properties, such as hierarchical
structure or broad distribution of community sizes.
Real-world networks, however, are incomparably more
complicated, often inhomogeneous in many respects and
usually contain many different kinds of mesoscopic structures.
Good performance on benchmark graphs does not
assure that communities identified in real data are meaningful.
Our analysis of the Infomap, Louvain and clique
percolation methods applied to a large social network reveals
that while all three methods do detect reasonable
communities in some respects, they still fall short
in others.

With all these methods the edge weights were higher
inside communities than between them, in accordance
with the Granovetter hypothesis [27]; distributions of
community sizes were broad, as expected; and tiling imperfection
revealed that while IM and LV produce different
partitions, they have a hierarchical relation where LV
communities tend to be inside IM communities. On the
other hand, both IM and LV yield treelike communities,
which do not coincide well with the notion of a social
community, and using edge weights makes the communities
even sparser. In contrast, CP clusters are always
found in dense regions of the graph and are therefore of- 
ten meaningful; as a downside CP may end up discarding 
some important parts of communities. 

A natural question is how well our findings generalize
to other types of networks. Analysis of multiple
datasets is beyond the scope of this work, but some
speculation is possible. Broad community size distributions
have already been observed in a number of studies
0, EqI, 0. Considering the numerous treelike commu- 
nities, similar sparse regions occur in other networks as 
well. For example, [10] found that the density of com- 
munities can vary widely across different network types; 
e.g. the Internet has very sparse communities while infor- 
mation networks (like arXiv citations) have dense ones. 
The similarity of IM and LV may hold too because both
partition the network and their heuristics are similar. Ref. [10]
observed that two very different partitioning methods resulted
in similar communities in terms of statistical properties.
On the other hand, the difference between CP and
the partition-based methods is likely to manifest itself in
various networks.

In large sparse networks partitioning methods in- 
evitably identify some questionable regions as commu- 
nities. The trees, starlike formations and stars detected 
by IM and LV do, however, bear mesoscopic structural 
meaning: they too are building blocks of the network. The
same topological structure may be considered a community
for one purpose but not for another: a star is
hardly a social community but may reasonably be considered
one in, for example, biochemical networks [10].

It would seem that the analysis of large empirical net- 
works would benefit from the use of complementary com- 
munity detection methods and a comparison of the iden- 
tified structural features. Instead of just devising ever 
more efficient community detection methods it might be 
more beneficial to take into consideration the existence
of different types of mesoscopic structures, as opposed to
fixating on a predefined idea of dense communities.



ACKNOWLEDGMENTS 

The project ICTeCollective acknowledges the finan- 
cial support of the Future and Emerging Technologies 
(FET) programme within the Seventh Framework Pro- 
gramme for Research of the European Commission, un- 
der FET-Open grant number: 238597. We acknowledge 
support by the Academy of Finland, the Finnish Center 
of Excellence program 2006-2011, project no. 129670. JK 
thanks OTKA K60456 and TEKES for partial support. 
We thank Albert-László Barabási for the data used in
this research.



Appendix A: Notes on applying the methods 

The Louvain method. The LV agglomeratively builds 
larger communities until no improvement in modularity 
can be achieved. Our data yielded very large communities 
with sizes up to 5* 5 x 10^ nodes both for LV and wLV, 
and hence we adopted the view that the different
renormalization levels correspond to different levels of
hierarchical organization, as suggested in Ref. [9]. To obtain
meaningful, smaller social communities and to be able to 
compare results with other methods we chose to use the 
first level, i.e. before the first merger of communities was 
made. This step revealed another feature of LV: while 
the modularity value is quite similar regardless of the 
order the nodes are processed in, the size of the largest 
community varies greatly. We use a partition where the 
size of the largest community is around 10^ since this 
makes sense in the social context. Because LV uses a lo- 
cal heuristic and we are dealing with a very large network, 
it is reasonable to assume that the statistical properties 
of the partitions are on average similar and do not vary 
as much as the size of the largest community. For a
detailed description of the stability of both LV and IM, see
Appendix B. In addition, the LV algorithm can in some
cases produce disconnected communities. Only a few such
communities were encountered, and we dealt with this
by turning each connected component into a community.
Code for the algorithm is available for download [31].
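
The post-processing step for disconnected communities can be sketched as follows. This is a minimal illustration and not the code used for the paper: the function name split_disconnected, the edge-list input and the membership dictionary are assumptions made for the example.

```python
from collections import defaultdict, deque


def split_disconnected(edges, membership):
    """Turn each connected component of a community into its own community.

    edges      -- iterable of (u, v) pairs of an undirected graph
    membership -- dict mapping node -> community label
    Returns a dict mapping node -> new integer community label.
    """
    # Keep only edges whose two endpoints lie in the same community.
    adj = defaultdict(set)
    for u, v in edges:
        if membership[u] == membership[v]:
            adj[u].add(v)
            adj[v].add(u)

    new_label = {}
    next_id = 0
    for start in membership:
        if start in new_label:
            continue
        # Breadth-first search restricted to intra-community edges.
        new_label[start] = next_id
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in new_label:
                    new_label[nb] = next_id
                    queue.append(nb)
        next_id += 1
    return new_label


# Hypothetical toy input: community "a" consists of two separate pieces.
edges = [(1, 2), (3, 4), (2, 5)]
membership = {1: "a", 2: "a", 3: "a", 4: "a", 5: "b"}
labels = split_disconnected(edges, membership)
```

In the toy input, nodes 1 and 2 stay together, nodes 3 and 4 form a second community, and node 5 keeps a community of its own.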

The Infomap method. The implementation code for
Infomap is available for download [32]. No changes to the
code were required.

Clique percolation. For CP we need to select the value
of k such that there is no percolating cluster. For our
data, k = 3 gives rise to a giant community but k = 4
does not and thus we select k = 4.
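
The selection of k can be illustrated on a toy graph. The sketch below is a brute-force version of k-clique percolation that is only practical for very small graphs; the helper names, the example graph and the cut-off max_fraction = 0.5 used to decide whether a community counts as "giant" are all assumptions made for the illustration.

```python
from itertools import combinations


def build_adj(edges):
    """Adjacency sets of an undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_clique_communities(adj, k):
    """Brute-force k-clique percolation; returns a list of node sets."""
    nodes = sorted(adj)
    # Enumerate all k-cliques by testing every k-subset of nodes.
    cliques = [c for c in combinations(nodes, k)
               if all(v in adj[u] for u, v in combinations(c, 2))]

    # Union-find over cliques: two cliques are adjacent when they
    # share k - 1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(set(cliques[i]) & set(cliques[j])) == k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return list(groups.values())


def smallest_non_percolating_k(adj, ks, max_fraction=0.5):
    """First k whose largest community covers at most max_fraction of
    the nodes (max_fraction is a hypothetical cut-off)."""
    n = len(adj)
    for k in ks:
        comms = k_clique_communities(adj, k)
        largest = max((len(c) for c in comms), default=0)
        if largest <= max_fraction * n:
            return k
    return None


# Toy graph: two 4-cliques joined by a chain of triangles.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (2, 4), (3, 4), (3, 5), (4, 5), (4, 6), (5, 6),
         (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]
adj = build_adj(edges)
chosen_k = smallest_non_percolating_k(adj, ks=[3, 4])
```

On this toy graph the triangles percolate through all nine nodes at k = 3, whereas k = 4 yields two separate communities, so the sketch selects k = 4.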



Note, however, that this assumption has not yet been verified 
e.g. with benchmarks. 




Figure 10. (Color online) To find the critical threshold $I_c$ for
wCP we build up communities by adding cliques in descending
order of intensity $I$, and monitor the largest component size
$m(I_>)$ (□) and susceptibility $\chi(I_>)$ as functions of the
fraction of cliques added. The transition occurs
when about 24 % of cliques have been added ($I_> \approx 3093$).



Table II. Running times of the different algorithms on our
data set of $N = 4.9 \times 10^6$ nodes and $L = 10.9 \times 10^6$ links.





                      unweighted     weighted
Louvain               2 min 7 s      1 min 30 s
Infomap               46 h 44 min    3 h 20 min
Clique percolation    2 min 10 s     4 min 52 s



For the weighted wCP we start with k = 3 and find
the threshold intensity $I_>$ for which the giant community
disappears. Thus we look for the percolation point
using clique intensity as the control parameter [23] and
set the intensity threshold $I_>$ slightly below the critical
point. This point can be identified by the maximum of
the susceptibility-like quantity

where $S$ is community size, and $\alpha$ and $\beta$ index the
communities. We varied $I_>$ while monitoring the order
parameter $m(I_>)$ and the susceptibility $\chi(I_>)$ (see Figure
10). When 24 % of the cliques have been added in order
of descending intensity, a giant cluster emerges, while
susceptibility shows a pronounced peak. This point
corresponds to the critical intensity $I_c \approx 3093$, which was
chosen as our threshold. For CP and wCP, we applied the
fast algorithm introduced in [33]. A sample implementation
can be found at [sj
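
The quantities monitored during the threshold scan can be sketched in a few lines. Two assumptions are made here and should be read as such: the susceptibility is taken in a form common in percolation analyses, $\chi = \sum_{\alpha \neq \max} S_\alpha^2 / (\sum_\beta S_\beta)^2$, which matches the symbols $S$, $\alpha$ and $\beta$ used above but is not quoted from the text, and the order parameter is normalized by the number of nodes. Clique intensity is computed as the geometric mean of the edge weights of a clique.

```python
import math


def clique_intensity(edge_weights):
    """Geometric mean of the edge weights of one clique."""
    logs = [math.log(w) for w in edge_weights]
    return math.exp(sum(logs) / len(logs))


def order_parameter(sizes, n_nodes):
    """Fraction of nodes in the largest community (assumed normalization)."""
    return max(sizes, default=0) / n_nodes


def susceptibility(sizes):
    """Assumed form: squared sizes of all communities except the largest,
    summed and divided by the squared total size."""
    if not sizes:
        return 0.0
    rest = sorted(sizes)[:-1]  # drop one copy of the largest size
    total = sum(sizes)
    return sum(s * s for s in rest) / (total * total)


# Hypothetical community sizes at some value of the threshold.
sizes = [4, 3, 3]
chi = susceptibility(sizes)
m = order_parameter(sizes, n_nodes=20)
```

Scanning the threshold then amounts to recomputing the community sizes for each value of $I_>$ and locating the maximum of the susceptibility.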

The running times of all the algorithms used are displayed
in Table II. LV and CP are extremely fast, while
Infomap takes from a few hours to about two days to complete.
All runs were done on a standard desktop machine, utilizing
a single processor.



^ Note that with k = 4 the weighted communities would be identical
to the unweighted ones, as in the absence of percolation the
intensity threshold would be set to 0. Using k = 2, on the other
hand, would correspond to simply using a weight threshold on
single edges.

Table III. Comparison of the stability of stochastic algorithms.
We generated 20 partitions with each method using different
random seeds, and present the smallest and largest observed
values of $|P_i|$ and $S_{\max}$ over all 20 runs and of $f_{pm}^{pair}$
over the 20 ordered pairs $(P_i, P_j)$ with $|i - j| = 1$. The value
of $f_{pm}^{all} = |C_{pm}(\{P_i\}_{i=1}^{20})| / |P_j|$ depends on the partition only
through $|P_j|$ and is therefore also very stable; we list the value
corresponding to the largest $|P_j|$.







        $|P_i|$              $S_{\max}$      $f_{pm}^{pair}$    $f_{pm}^{all}$
IM      280000 - 280516      2964 - 3672     42.1 - 42.6 %      13.2 %
LV      1293903 - 1298256    811 - 11390     72.1 - 72.8 %      36.7 %
wIM     674587 - 674727      209 - 247       97.4 - 97.5 %      92.5 %
wLV     1155557 - 1155985    73 - 112        95.8 - 95.9 %      90.1 %



Appendix B: Stability of the stochastic methods 

Both IM and LV are stochastic methods, and therefore
the partitions produced by different runs will not be identical.
To see how stable the algorithms are we run each
method 20 times with different random seeds to generate
partitions $P_i = \{c_j^i\}$, $i = 1, \ldots, 20$, and study the stability
of the number of communities found ($|P_i|$), the size
of the largest community ($S_{\max} = \max_j |c_j^i|$), and the
stability of identified communities across the runs. Let
$\mathcal{P} = \{P_1, P_2, \ldots\}$ be a set of partitions and denote by
$C_{pm}(\mathcal{P}) = \bigcap_{P \in \mathcal{P}} P$ the set of communities that appear in
all partitions, i.e. the set of perfectly matching communities.
For any $P_i \in \mathcal{P}$ the fraction of perfect matches is
$f_{pm}(P_i; \mathcal{P}) = |C_{pm}(\mathcal{P})| / |P_i|$. We denote by $f_{pm}^{pair}$ the fraction
of perfect matches when $\mathcal{P}$ consists of two partitions, and
by $f_{pm}^{all}$ the fraction of perfect matches when $\mathcal{P}$ consists
of all 20 partitions generated by a single method.
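
The definitions above translate directly into set operations. The following is a minimal sketch in which each partition is represented as a set of frozensets of node labels; the helper names and the three toy partitions are hypothetical.

```python
def as_partition(groups):
    """Represent a partition as a set of frozensets of nodes."""
    return {frozenset(g) for g in groups}


def perfect_matches(partitions):
    """Communities that appear, node for node, in every partition."""
    common = set(partitions[0])
    for p in partitions[1:]:
        common &= set(p)
    return common


def f_pm(partition, partitions):
    """Fraction of the communities of `partition` that are perfect
    matches across all of `partitions`."""
    return len(perfect_matches(partitions)) / len(partition)


# Three hypothetical partitions of the nodes 1..7.
P1 = as_partition([[1, 2], [3, 4], [5, 6, 7]])
P2 = as_partition([[1, 2], [3, 4, 5], [6, 7]])
P3 = as_partition([[1, 2], [3, 4], [5], [6, 7]])

f_pair_12 = f_pm(P1, [P1, P2])      # only {1, 2} is shared
f_pair_13 = f_pm(P1, [P1, P3])      # {1, 2} and {3, 4} are shared
f_all = f_pm(P1, [P1, P2, P3])      # only {1, 2} survives in all three
```

Because the intersection can only shrink as partitions are added, the all-partition fraction never exceeds any pairwise fraction computed for the same reference partition.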

The results are summarized in Table III. It turns out
that both weighted methods are very stable not only with
respect to $|P_i|$ and $S_{\max}$, but also with respect to the
identity of communities: with both wIM and wLV we get
$f_{pm}^{all} > 0.9$, which means that over 90 % of communities
are identical in all 20 runs. The variation comes mostly
from large communities.

In the unweighted case both IM and LV are stable
with respect to $|P_i|$, and IM also with respect to $S_{\max}$.
The identity of the communities found, however, exhibits more
variation: e.g., only 13 % of communities found by a single
run of IM appear in all 20 runs. Furthermore, looking
at the unmatched communities for any pair (i.e. those in
$P_i \setminus C_{pm}(\{P_i, P_j\})$), in IM about 32 % have tiling
imperfection X < 0.2, and the average tiling imperfection is
0.46; in LV only 17 % of such communities have X < 0.2,
with an average tiling imperfection of 0.57. Thus the remaining
communities are in general not even close matches.
As with weighted methods, small communities are more
likely to match perfectly than larger ones.

Instability of a method is of course problematic for 
anyone wanting to identify the "true" communities of 
a given network. It is however premature to judge IM 
and LV because of this: the network topology is inher- 
ently noisy, and does not necessarily contain enough in- 
formation to uniquely identify the communities. Includ- 
ing weights made both methods much more stable, which 
suggests that the link weights contain information be- 
yond the network topology. Note that there is informa- 
tion even in the instability: any two IM partitions share 
42 % of their communities, but if these shared communities
were chosen uniformly at random only $0.42^{19} \approx 10^{-5}$
% of the communities would appear in all 20 partitions,
much less than the actual value of 13.2 %.
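
The order of magnitude of this estimate can be checked by hand. Assume, as in the argument above, that a community of one partition reappears in each of the other $n - 1 = 19$ partitions independently with probability $p = 0.42$; reading the exponent as 19 rather than 20 is an interpretation made here. The expected fraction is then

```latex
\[
  p^{\,n-1} = 0.42^{19} \approx 6.9 \times 10^{-8}
            = 6.9 \times 10^{-6}\,\% ,
\]
```

which is more than six orders of magnitude below the observed 13.2 %.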

The high stability of wIM and wLV may be partly
explained by the fat-tailed distribution of call lengths
in a mobile call network [26]. Since both methods are
based on using probabilities proportional to the edge
weights, an edge with a weight several orders of magnitude
larger than the average will be placed inside a
community almost independently of the network topology.
On the other hand, in wCP the definition of intensity
as the geometric average takes the fat-tailed degree
distribution well into account, and is equivalent to using
weights $w'_{ij} = \log w_{ij}$, the arithmetic mean for intensity,
and the intensity threshold $I'_> = \log I_>$. While one
could use logarithmic weights also with wIM and wLV,
this is problematic as the ratio of log-weights is not scale
invariant and therefore the result would depend on the
unit used to measure call length.
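
Both statements in this paragraph can be checked numerically. The sketch below uses hypothetical call lengths and a hypothetical threshold; it shows that thresholding the geometric mean of the weights is the same test as thresholding the arithmetic mean of the log-weights at the logarithm of the threshold, and that ratios of log-weights change when the unit of time changes while ratios of raw weights do not.

```python
import math


def geometric_mean(ws):
    """Geometric mean of positive weights."""
    return math.exp(sum(math.log(w) for w in ws) / len(ws))


def arithmetic_mean(xs):
    """Plain arithmetic mean."""
    return sum(xs) / len(xs)


weights = [30.0, 120.0, 4800.0]   # hypothetical call lengths in seconds
threshold = 200.0                 # hypothetical intensity threshold

# The same decision expressed on the raw scale and on the log scale.
passes_geometric = geometric_mean(weights) > threshold
passes_log = (arithmetic_mean([math.log(w) for w in weights])
              > math.log(threshold))

# Ratios of raw weights are unit independent ...
w1, w2 = 120.0, 4800.0
ratio_raw_seconds = w2 / w1
ratio_raw_minutes = (w2 / 60) / (w1 / 60)

# ... whereas ratios of log-weights depend on the unit.
ratio_log_seconds = math.log(w2) / math.log(w1)
ratio_log_minutes = math.log(w2 / 60) / math.log(w1 / 60)
```

With these numbers the raw ratio is 40 in both units, while the log ratio is roughly 1.8 in seconds and 6.3 in minutes, which illustrates why methods based on weight-proportional probabilities would give unit-dependent results with logarithmic weights.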

Finally, as suggested by the stability of $|P_i|$ and $S_{\max}$,
the qualitative properties of the communities are very
stable even though the exact identity of the communities is
not. For example, IM repeatedly produces treelike communities
even if the communities are not made up of the
same nodes. Because of this statistical stability no error
is made by comparing the methods by using only single
realizations from each method.